RapidMiner Studio documentation for version 10.0
Dictionary Based Sentiment (Documents) (Operator Toolbox)
Synopsis
This operator creates a sentiment model from an annotated list of words. It expects an ExampleSet of key (nominal) and value (numerical) pairs, where each key is a word and each value is a weight describing how negative or positive that word is. The resulting model can later be used in an Apply Dictionary Based Sentiment operator to score tokenized documents.
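As a rough illustration of the idea (not the operator's actual implementation), the key/value table can be thought of as a lookup from word to weight. The Python sketch below assumes that a document score is simply the sum of the weights of its tokens; the function names `build_dictionary` and `score_tokens` are hypothetical and not part of RapidMiner.

```python
def build_dictionary(rows):
    """Turn (key, value) pairs into a word -> weight lookup."""
    return {str(key): float(value) for key, value in rows}


def score_tokens(tokens, dictionary):
    """Sum the weights of all tokens found in the dictionary.

    Assumption: summation is used here for illustration only; the
    operator's real aggregation is not described in this page.
    """
    return sum(dictionary.get(token, 0.0) for token in tokens)


# Two-word dummy dictionary, as in the tutorial processes.
dictionary = build_dictionary([("good", 1), ("bad", -1)])

print(score_tokens(["the", "movie", "was", "good"], dictionary))  # 1.0
print(score_tokens(["bad", "and", "bad"], dictionary))            # -2.0
```

Words that do not appear in the dictionary contribute nothing to the score in this sketch.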
This operator also supports negations. A negation dictionary can be connected to the neg port. If a negation word occurs within x words before a given word, that word's weight is inverted; x is set with the negation window size parameter. The strength of each negation can be defined with the negation strength parameter. If this parameter is left empty, a weight of 1 is used for every negation. Note that negation weights are defined as positive numbers.
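The negation window can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the operator's code: a negated weight is assumed to become `-strength * weight`, the nearest negation inside the window is assumed to win, and the function name `score_with_negation` is hypothetical.

```python
def score_with_negation(tokens, dictionary, negations, window_size=1):
    """Score tokens, inverting a word's weight if a negation precedes it.

    Assumptions (not confirmed by the documentation): a negated weight is
    -strength * weight, and the nearest negation in the window is applied.
    """
    total = 0.0
    for i, token in enumerate(tokens):
        weight = dictionary.get(token)
        if weight is None:
            continue
        # Inspect up to `window_size` tokens directly before position i,
        # nearest token first.
        start = max(0, i - window_size)
        for previous in reversed(tokens[start:i]):
            if previous in negations:
                weight = -negations[previous] * weight
                break
        total += weight
    return total


dictionary = {"good": 1.0, "bad": -1.0}
negations = {"not": 1.0}  # strength 1, the documented default

print(score_with_negation(["this", "is", "not", "good"], dictionary, negations))  # -1.0
# With window size 1, "not" is too far away from "good" to negate it.
print(score_with_negation(["this", "is", "not", "very", "good"], dictionary, negations))  # 1.0
# With window size 2, the negation reaches "good".
print(score_with_negation(["this", "is", "not", "very", "good"], dictionary, negations, window_size=2))  # -1.0
```

Because negation strengths are positive numbers, the sign flip comes from the inversion itself rather than from the stored weight.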
In addition to negations, this operator supports intensifiers. Intensifiers are words that increase or decrease the sentiment of another word. For example, the sentiment of the word "good" can be strengthened by a word like "very". If the corresponding weight is smaller than 1, the word acts as a "de-intensifier"; an example would be the word "relatively". If a word is already negated, intensifiers are ignored.
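The interplay of intensifiers and negations can be sketched as follows. This is an illustrative Python sketch only: it assumes that an intensifier multiplies the word's weight, that a negated weight becomes `-strength * weight`, and that the nearest matching word in each window is used. The intensifier weights 2.0 and 0.5 and the function name `score_with_intensifiers` are hypothetical.

```python
def score_with_intensifiers(tokens, dictionary, negations, intensifiers,
                            negation_window=1, intensifier_window=1):
    """Score tokens with negation and intensifier handling.

    Assumptions (for illustration): intensifiers multiply the weight, and
    a negated word ignores intensifiers, as the description above states.
    """
    total = 0.0
    for i, token in enumerate(tokens):
        weight = dictionary.get(token)
        if weight is None:
            continue
        neg_window = tokens[max(0, i - negation_window):i]
        negation = next((w for w in reversed(neg_window) if w in negations), None)
        if negation is not None:
            # Negated words skip the intensifier step entirely.
            weight = -negations[negation] * weight
        else:
            int_window = tokens[max(0, i - intensifier_window):i]
            intensifier = next((w for w in reversed(int_window) if w in intensifiers), None)
            if intensifier is not None:
                weight = intensifiers[intensifier] * weight
        total += weight
    return total


dictionary = {"good": 1.0, "bad": -1.0}
negations = {"not": 1.0}
intensifiers = {"very": 2.0, "relatively": 0.5}  # hypothetical weights

print(score_with_intensifiers(["very", "good"], dictionary, negations, intensifiers))        # 2.0
print(score_with_intensifiers(["relatively", "good"], dictionary, negations, intensifiers))  # 0.5
# "good" is negated (negation window 2), so "very" is ignored.
print(score_with_intensifiers(["not", "very", "good"], dictionary, negations, intensifiers,
                              negation_window=2))  # -1.0
```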
Input
- exa (Data Table)
Input ExampleSet with a Key and a Value Attribute.
- neg (Data Table)
ExampleSet providing a list of negation words like "not" and their weights.
- int (Data Table)
ExampleSet providing a list of intensifiers like "very" and their weights.
Output
- mod
The resulting model.
- ori (Data Table)
The passed through input ExampleSet.
Parameters
- key_attribute The attribute that contains the key (word).
- value_attribute The attribute that contains the value (score).
- negation_attribute Attribute in the negation ExampleSet which holds the individual words.
- negation_strength Attribute in the negation ExampleSet which holds the weight of each negation. Weights are defined as positive numbers.
- negation_window_size Window size for negation. A value of 1 means that the negation needs to be directly in front of the word it affects.
- use_symmetric_negation_window If set to true, negation is applied not only to succeeding tokens but also to preceding tokens. This covers sentences like "The crisis is not as bad as forecasted", where you may want to invert both "crisis" and "bad".
- use_intensifier If set to true, you can define intensifiers.
- intensifier_word Attribute in the intensifier ExampleSet which holds the individual words.
- intensifier_value Attribute in the intensifier ExampleSet which holds the weight of each intensifier. Weights are defined as positive numbers.
- intensifier_window_size Window size for intensifiers. A value of 1 means that the intensifier needs to be directly in front of the word it affects.
- use_symmetric_intensifier_window If set to true, intensifiers are applied not only to succeeding tokens but also to preceding tokens.
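The effect of a symmetric negation window can be sketched in Python using the example sentence from the parameter description. This is a sketch under assumptions, not the operator's implementation: the dictionary weights for "crisis" and "bad" are made up, a negated weight is assumed to be `-strength * weight`, and the function name `negated_score` is hypothetical.

```python
def negated_score(tokens, dictionary, negations, window_size=1, symmetric=False):
    """Score tokens with an optional symmetric negation window.

    Assumption: with symmetric=True, a negation within `window_size`
    tokens before OR after a word inverts that word's weight.
    """
    total = 0.0
    for i, token in enumerate(tokens):
        weight = dictionary.get(token)
        if weight is None:
            continue
        # Tokens before position i, and optionally the same number after it.
        neighbours = tokens[max(0, i - window_size):i]
        if symmetric:
            neighbours = neighbours + tokens[i + 1:i + 1 + window_size]
        strengths = [negations[w] for w in neighbours if w in negations]
        if strengths:
            weight = -strengths[0] * weight
        total += weight
    return total


tokens = "the crisis is not as bad as forecasted".split()
dictionary = {"crisis": -1.0, "bad": -1.0}  # hypothetical weights
negations = {"not": 1.0}

# One-sided window: only "bad" follows "not", so only "bad" is inverted.
print(negated_score(tokens, dictionary, negations, window_size=2))                  # 0.0
# Symmetric window: "crisis" precedes "not" within 2 tokens and is inverted too.
print(negated_score(tokens, dictionary, negations, window_size=2, symmetric=True))  # 2.0
```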
Tutorial Processes
Generate Dictionary Model on Dummy Data
Generate a dummy dictionary with just two words (good and bad with values 1 and -1) and generate a model based on this data.
Generate Dictionary Model on Dummy Data with Negation
Generate a dummy dictionary with just two words (good and bad with values 1 and -1) and generate a model based on this data. Additionally, a negation dictionary is provided. The weights of all words preceded by "not" are inverted. The model is applied to three documents to display the effect of negations.